The UCREL Semantic Analysis System

Authors

  • Paul Rayson
  • Dawn Archer
  • Scott Piao
  • Tony McEnery
Abstract

The UCREL semantic analysis system (USAS) is a software tool for undertaking the automatic semantic analysis of English spoken and written data. This paper describes the software system, and the hierarchical semantic tag set containing 21 major discourse fields and 232 fine-grained semantic field tags. We discuss the manually constructed lexical resources on which the system relies, and the seven disambiguation methods including part-of-speech tagging, general likelihood ranking, multi-word-expression extraction, domain of discourse identification, and contextual rules. We report an evaluation of the accuracy of the system compared to a manually tagged test corpus on which the USAS software obtained a precision value of 91%. Finally, we make reference to the applications of the system in corpus linguistics, content analysis, software engineering, and electronic dictionaries.
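The evaluation mentioned above compares system output against a manually tagged test corpus. As a rough illustration only, the sketch below shows one way a token-level precision figure of this kind could be computed, assuming exactly one semantic tag per token and aligned sequences. The function name `precision` and the sample tag sequences are hypothetical and are not taken from the USAS software or its test corpus.

```python
def precision(predicted, gold):
    """Return the share of predicted tags that match the manual tag at the same position.

    Assumption: both sequences are aligned token by token and each token
    carries exactly one semantic tag.
    """
    if len(predicted) != len(gold):
        raise ValueError("predicted and gold must be aligned token by token")
    if not predicted:
        raise ValueError("cannot compute precision on an empty sequence")
    correct = sum(1 for p, g in zip(predicted, gold) if p == g)
    return correct / len(predicted)


# Illustrative data only: tag strings are written in the style of semantic
# field tags, but the sequences are invented for this example.
gold = ["Z5", "A1.1.1", "Z5", "Q1.2", "Z5"]
predicted = ["Z5", "A1.1.1", "Z5", "Q4.1", "Z5"]

score = precision(predicted, gold)
print(f"{score:.0%}")
```

With one tag assigned per token, this measure coincides with simple accuracy; a system that can leave tokens untagged or assign several candidate tags would need a different denominator.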

Similar resources

Comparing the UCREL Semantic Annotation Scheme with Lexicographical Taxonomies

Annotation schemes for semantic field analysis use abstract concepts to classify words and phrases in a given text. The use of such schemes within lexicography is increasing. Indeed, our own UCREL semantic annotation system (USAS) is to form part of a web-based ‘intelligent’ dictionary (Herpiö 2002). As USAS was originally designed to enable automatic content analysis (Wilson and Rayson 1993), ...

Using the Ucrel Automated Semantic Analysis System to Investigate Differing Concerns in Refugee Literature

Forced Migration Online (FMO) provides instant access to a wide variety of online resources dealing with forced migrants, and their plight worldwide. The online International Thesaurus of Refugee Terminology (ITRT), in turn, provides users with refugee-related terminology in three languages (English, French and Spanish). The FMO data can already be searched thematically as well as regionally. T...

Developing an automated semantic analysis system for Early Modern English

As reported by Wilson and Rayson (1993) and Rayson and Wilson (1996), the UCREL semantic analysis system (USAS) has been designed to undertake the automatic semantic analysis of present-day English (henceforth PresDE) texts. In this paper, we report on the feasibility of (re)training the USAS system to cope with English from earlier periods, specifically the Early Modern English (henceforth Emo...

Comparing and combining a semantic tagger and a statistical tool for MWE extraction

Automatic extraction of multiword expressions (MWEs) presents a tough challenge for the NLP community and corpus linguistics. Indeed, although numerous knowledge-based symbolic approaches and statistically driven algorithms have been proposed, efficient MWE extraction still remains an unsolved issue. In this paper, we evaluate the Lancaster UCREL Semantic Analysis System (henceforth USAS (Rayso...

The ACAMRIT semantic tagging system

Building on a successful previous project, UCREL (the University Centre for Computer Corpus Research on Language) is collaborating with Reflexions Communication Research (a market research company in London, UK) to develop software which will undertake the semantic tagging of words in a text, facilitate the assignment of 'content tags' to those words, and provide a statistical analysis of the r...

Publication date: 2004